-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix OOM on malformed input #99
Conversation
This commit make up for the lack of scala#35. Specifically, CDATA sections, comment, DTD.
Hey @shimamoto , Thank you for taking the time report and submit a PR. Please give me some time to let me go through this issue and also some others to respond back. We are on the verge of a release. If everything works fine, I will merge this PR and do the release. Thanks again. |
@biswanaths I am grateful for your prompt response. Then, I am looking forward to a good reply from you! |
@@ -247,7 +247,7 @@ private[scala] trait MarkupParserCommon extends TokenTests { | |||
while (true) { | |||
if (ch == head && peek(rest)) | |||
return handler(positioner(), sb.toString) | |||
else if (ch == SU) | |||
else if (ch == SU || eof) | |||
truncatedError("") // throws TruncatedXMLControl in compiler | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't seem like adding the eof
here triggers an OOM for any of your tests. I couldn't find an example that triggered the OOM for an XML processing instruction (PI) or a CDATA section -- they're the only ones that use xTakeUntil
. I guess it's ok to cargo-cult an eof
check everywhere, but it seems you were trying to avoid that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ashawley Thank you for the review and comment! It is significant to add this eof
. This is necessary in test of CDATA section <broken><![CDATA[A
. Without it, the following OOM occur:
Exception in thread "XMLEventReader" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:3332)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:137)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:121)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:622)
at java.lang.StringBuilder.append(StringBuilder.java:202)
at scala.collection.mutable.StringBuilder.append(StringBuilder.scala:267)
at scala.xml.parsing.MarkupParserCommon$class.xTakeUntil(MarkupParserCommon.scala:253)
at scala.xml.pull.XMLEventReader$Parser.xTakeUntil(XMLEventReader.scala:60)
at scala.xml.parsing.MarkupParser$class.xCharData(MarkupParser.scala:379)
at scala.xml.pull.XMLEventReader$Parser.xCharData(XMLEventReader.scala:60)
at scala.xml.parsing.MarkupParser$class.content1(MarkupParser.scala:424)
at scala.xml.pull.XMLEventReader$Parser.content1(XMLEventReader.scala:60)
at scala.xml.parsing.MarkupParser$class.content(MarkupParser.scala:459)
at scala.xml.pull.XMLEventReader$Parser.content(XMLEventReader.scala:60)
at scala.xml.parsing.MarkupParser$class.element1(MarkupParser.scala:588)
at scala.xml.pull.XMLEventReader$Parser.element1(XMLEventReader.scala:60)
at scala.xml.parsing.MarkupParser$class.content1(MarkupParser.scala:433)
at scala.xml.pull.XMLEventReader$Parser.content1(XMLEventReader.scala:60)
at scala.xml.parsing.MarkupParser$class.document(MarkupParser.scala:247)
at scala.xml.pull.XMLEventReader$Parser.document(XMLEventReader.scala:60)
at scala.xml.pull.XMLEventReader$Parser$$anonfun$run$1.apply(XMLEventReader.scala:96)
at scala.xml.pull.XMLEventReader$Parser$$anonfun$run$1.apply(XMLEventReader.scala:96)
at scala.xml.pull.ProducerConsumerIterator$class.interruptibly(XMLEventReader.scala:125)
at scala.xml.pull.XMLEventReader.interruptibly(XMLEventReader.scala:27)
at scala.xml.pull.XMLEventReader$Parser.run(XMLEventReader.scala:96)
at java.lang.Thread.run(Thread.java:745)
It turned out that error checking is rudimentary, however, you are quite right. I fix some tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shimamoto Ok, cool! I thought that was odd. Thanks for fixing up the unit tests!
Changes look good to me! Thanks for cleaning up and refactoring the tests while you were at it. |
Verify to throw FatalError at a broken CDATA section and comment.
Fix unit tests of a broken CDATA section and comment since not enough the verification. Without this time fix, I confirmed that OOM occur in these cases. |
This looks good to me. If there is nothing else I will merge this tomorrow. |
Hi.
I tried in the latest master version since the bug (#35) have been merged.
But the survey found that there are still cases that OOM occur.
Specifically, CDATA sections, comment, DTD.
All of the above cases occur OOM.
I am in trouble since OOM have a large impact on the production.
This PR is going to avoid the OOM caused by the above-mentioned.
This commit make up for the lack of #35. Specifically,
CDATA sections, comment, DTD.